Add GPU runner for linux-aarch64 #289

Merged: 10 commits, Dec 13, 2024

Member

@leofang leofang commented Dec 12, 2024

No description provided.

@leofang leofang added the P0 (High priority - Must do!) and CI/CD (CI/CD infrastructure) labels Dec 12, 2024
@leofang leofang self-assigned this Dec 12, 2024

copy-pr-bot bot commented Dec 12, 2024

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@leofang
Member Author

leofang commented Dec 12, 2024

/ok to test

@leofang
Member Author

leofang commented Dec 12, 2024

/ok to test

@leofang leofang requested a review from sandeepd-nv December 12, 2024 01:56
ksimpson-work previously approved these changes Dec 12, 2024
Contributor

@ksimpson-work ksimpson-work left a comment

Seems legit, so long as we get the runner issues sorted and rerun the CI before submission, in case the problem spans multiple runner platforms.

@leofang
Member Author

leofang commented Dec 12, 2024

/ok to test

@leofang
Member Author

leofang commented Dec 12, 2024

/ok to test

@leofang
Member Author

leofang commented Dec 13, 2024

It seems that on aarch64 we hit this issue: actions/setup-python#961


copy-pr-bot bot commented Dec 13, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@leofang
Member Author

leofang commented Dec 13, 2024

/ok to test

@leofang
Member Author

leofang commented Dec 13, 2024

/ok to test

@leofang
Member Author

leofang commented Dec 13, 2024

/ok to test

@leofang
Member Author

leofang commented Dec 13, 2024

/ok to test

@leofang
Member Author

leofang commented Dec 13, 2024

/ok to test

Comment on lines -89 to +91
- image: condaforge/miniforge3:latest
+ image: ubuntu:22.04
Member Author

@leofang leofang Dec 13, 2024

setup-python is being very restrictive, as it purposely locks in to Ubuntu.

(And for some reason, the miniforge3 container can still work w/ setup-python on linux-64 but not on linux-aarch64...)
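For readers following along, the image swap in this review thread would sit in a job definition roughly like the one sketched here. This is a minimal illustration, not the workflow from this PR: only the two image values come from the diff, while the job name, the runner label, the Python version, and the assumption that `image:` lives under a job-level `container:` key are all hypothetical. Any GPU passthrough options the real job needs are omitted.

```yaml
# Hypothetical sketch; only the two image values are taken from the diff.
jobs:
  test-linux-aarch64:              # hypothetical job name
    runs-on: linux-arm64-gpu       # hypothetical runner label
    container:
      image: ubuntu:22.04          # previously: condaforge/miniforge3:latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"   # hypothetical version
      - run: python --version      # quick check that the interpreter resolved
```

Using a stock Ubuntu image keeps setup-python on the platform it expects, at the cost of losing the conda tooling that the miniforge3 image shipped with.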

Collaborator

@jakirkham jakirkham Dec 13, 2024

We added ARM support earlier this year: conda-incubator/setup-miniconda#331

So when you have a moment, could you please raise the Miniforge GHA-specific issue?

Edit: Or perhaps this is as simple as switching to conda-incubator/setup-miniconda
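As a rough idea of what that switch could look like, here is a minimal sketch of steps using conda-incubator/setup-miniconda with Miniforge. It is an assumption-laden illustration rather than a tested configuration for this repository: the environment name and Python version are hypothetical, and whether it behaves correctly on the linux-aarch64 GPU runner has not been verified here.

```yaml
# Hypothetical sketch of the suggested alternative; not taken from this PR.
steps:
  - uses: actions/checkout@v4
  - uses: conda-incubator/setup-miniconda@v3
    with:
      miniforge-version: latest
      python-version: "3.12"        # hypothetical version
      activate-environment: test    # hypothetical environment name
  - name: Check interpreter
    shell: bash -el {0}             # login shell so the conda environment is activated
    run: python --version
```

With this approach the Python interpreter comes from conda rather than from setup-python, so the Ubuntu restriction discussed above would not apply.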

Member Author

If the question is "why don't you use conda to set up a Python environment?", the answer is that it's on our TODO list: #280. Contributions are more than welcome 😉

Collaborator

Just saw the issue flagged above and tried to help convert this into an upstream issue or provide a path forward

@leofang
Member Author

leofang commented Dec 13, 2024

All runners are green, let me merge.

@leofang leofang merged commit f1267cd into NVIDIA:main Dec 13, 2024
30 checks passed
@leofang leofang deleted the add_arm_runner branch December 13, 2024 04:09
@leofang leofang added this to the cuda.core beta 2 milestone Dec 13, 2024